
    Person annotation in video sequences

    In recent years, the demand for video tools that automatically annotate and classify large audiovisual datasets has increased considerably. One specific task in this field applies to TV broadcast videos: determining who appears in a video sequence and when. This work builds on the ALBAYZIN evaluation series presented at IberSPEECH-RTVE 2018 in Barcelona; the purpose of this thesis is to improve on the results obtained there and to compare different face detection and tracking methods. We evaluate the performance of classic face detection techniques and of techniques based on machine learning on a closed dataset of 34 known people. All other people appearing in the audiovisual document are labelled as "unknown". We use short videos and images of each known person to build his or her model and, finally, evaluate the performance of the ALBAYZIN algorithm on a 2-hour video called "La noche en 24H", whose format resembles a news program. We analyze the results, the types of errors and scenarios encountered, and, where available, the solutions we propose for each of them. This work focuses only on monomodal face recognition and tracking.

    Scene background reconstruction from video sequences

    This bachelor's thesis (TFG) proposes several background initialization algorithms based on both temporal and spatial information. First, the state of the art is reviewed. Then, two algorithms are implemented, one operating at block level and the other at pixel level. Both use measures of the continuity (smoothness) of a block or pixel with its neighbourhood to decide which candidate fits best in the new background: the spatial region of the current frame being analyzed or the region already present in the background. Finally, these two algorithms, together with two others considered relevant in the literature, are evaluated on a dataset that is divided into several types of challenges for each algorithm.
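
    The block-level candidate selection described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the smoothness measure (mean absolute difference along the block border), the function names `get_block`, `seam_cost` and `update_block`, and the use of grayscale images stored as nested lists are all assumptions made for illustration.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def get_block(img: Image, top: int, left: int, size: int) -> Image:
    """Copy the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in img[top:top + size]]


def seam_cost(background: Image, block: Image, top: int, left: int) -> float:
    """Assumed smoothness proxy: mean absolute difference between the block's
    border pixels and the background pixels just outside the block."""
    size = len(block)
    height, width = len(background), len(background[0])
    diffs = []
    for i in range(size):
        if left > 0:  # column to the left of the block
            diffs.append(abs(block[i][0] - background[top + i][left - 1]))
        if left + size < width:  # column to the right of the block
            diffs.append(abs(block[i][size - 1] - background[top + i][left + size]))
        if top > 0:  # row above the block
            diffs.append(abs(block[0][i] - background[top - 1][left + i]))
        if top + size < height:  # row below the block
            diffs.append(abs(block[size - 1][i] - background[top + size][left + i]))
    return sum(diffs) / len(diffs) if diffs else 0.0


def update_block(background: Image, frame: Image,
                 top: int, left: int, size: int) -> bool:
    """Replace the background block with the current-frame block when the
    frame block joins its neighbourhood more smoothly. Returns True if replaced."""
    current = get_block(background, top, left, size)
    candidate = get_block(frame, top, left, size)
    if seam_cost(background, candidate, top, left) < seam_cost(background, current, top, left):
        for i in range(size):
            background[top + i][left:left + size] = candidate[i]
        return True
    return False
```

    In this sketch a block is replaced only when the current-frame candidate is strictly smoother against its surroundings, so ties keep the existing background. A pixel-level variant would apply the same comparison with a block size of 1.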
